Robustness of community structure in networks 
The discovery of community structure is a common challenge in the analysis of network data. 
Many methods have been proposed for finding community structure, but few have been proposed 
for determining whether the structure found is statistically significant or whether, conversely, it could 
have arisen purely as a result of chance. In this paper we show that the significance of community 
structure can be effectively quantified by measuring its robustness to small perturbations in network 
structure. We propose a suitable method for perturbing networks and a measure of the resulting 
change in community structure and use them to assess the significance of community structure in a 
variety of networks, both real and computer generated. 



I. INTRODUCTION 

Many networks of scientific interest decompose nat- 
urally into communities or modules, densely connected 
subsets of nodes with only sparser connections between 
them. In many cases communities have been found to 
correspond to behavioral or functional units within net- 
works, such as functional modules in biochemical net- 
works or social groups within social networks. This find- 
ing suggests that in networked systems whose function is 
less well understood we may be able to gain insight by 
discovering and examining their communities (if any), 
and methods for community discovery have, as a result, 
attracted a substantial amount of attention in the recent 
literature in many disciplines.

Communities are of interest for other reasons as well. 
Their presence can, for example, dramatically alter the 
behavior of dynamical processes on networks [3] (and indeed
the observation of dynamical processes has been proposed as
one possible method of community detection). Communities
can also be used as a basis for the reduction or
coarse-graining of networks for visualization or other
purposes. And communities frequently display different
statistics from the network as a whole, indicating that
global network statistics such as degree moments or
correlation functions may potentially fail to register
important heterogeneities.

A large number of methods for finding communities have been
proposed in recent years, including divisive methods based
on betweenness and similar measures, methods based on
searching for small cliques, information-theoretic
techniques [12], statistical inference through belief
propagation [13] or maximum likelihood [3], and many others.

Perhaps the most widely used technique, however, is the
maximization of the benefit function known as modularity,
which is (to within a multiplicative constant) the
difference between the number of edges within communities
and the expected number of such edges under an appropriate
null model. Various null models have been used but the
commonest by far is the standard configuration model, which
preserves the degree sequence of the original network but
otherwise randomizes edge positions. The modularity is then
maximized over possible divisions of the network, the
optimal division being taken to be the correct partition of
the network into communities.

Unfortunately, exhaustive maximization of the modularity is
known to be an NP-complete task [23] and hence is
essentially intractable for all but the smallest of
networks. In practical implementations of the modularity
method, therefore, approximate heuristics are usually
employed, such as greedy algorithms, extremal optimization
[13], simulated annealing [20], or spectral methods [21].
These methods vary in their effectiveness and speed, the
faster algorithms tending to give poorer results while the
slower ones can only be applied to smaller networks if
running time is to be kept to reasonable levels. In this
paper we employ the spectral optimization method introduced
in [21], which displays a reasonable balance between
accuracy and speed, but the calculations we describe are not
tied to this method, or even to modularity maximization in
general, and could be applied to any community detection
scheme with only minor modifications.

Despite the large volume of work on community de- 
tection and its applications, one important question re- 
mains largely unaddressed, that of the significance of the 
results. How can we tell when the communities detected 
by one method or another are truly significant and when 
they could be merely the consequence of a chance coin- 
cidence of edge positions in the network? Clear answers 
to this question are crucial if the results of community 
analyses are to carry any real weight. 

The modularity itself was originally proposed as a 
way of answering this question [5]: a network with 
strong community structure will have high modularity 
and hence the value of the modularity can be used as a 
quality function for communities. More recently, how- 
ever, it has been realized that this approach is insuf- 
ficient. Although it is true that networks with strong 
community structure have high modularity, it turns out 
that not all networks with high modularity have strong 
community structure. Indeed, there exist networks that 
most observers would consider to have no community 
structure at all that nonetheless have high modularity. 
Guimerà et al. showed numerically that divisions exist of
ordinary random graphs that have high modularity, even in
the limit of large network size, a result confirmed in later
analytic calculations by Reichardt and Bornholdt. The reason
for this at first peculiar finding is actually quite
straightforward: the number of possible
sible divisions of a network increases extremely fast with 
network size (faster than any exponential), so that al- 
though it is highly improbable that any one division will, 
purely by chance, have high modularity, it is, in the limit 
of large size, very likely that such a division will exist 
among the enormous number of possible candidates. As 
a result, high modularity is only a necessary but not suf- 
ficient condition for significant community structure. 

Several authors have suggested that instead we should 
look for divisions of a network that have significantly 
higher modularity than the random graph [25, 26]. For 
example, one could optimize the modularity for a large 
number of networks drawn from the random graph ensemble,
calculate the mean μ and standard deviation σ of those
modularity values, and then compare the modularity Q of the
optimal division of the real network to those values,
calculating, for instance, a z-score:

    z = \frac{Q - \mu}{\sigma},    (1)

which measures how many standard deviations the real 
modularity is above the mean for the random graph. If 
z ≫ 1 then Q is, in a precise sense, significantly greater 
than the modularity of the random graph. 
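
As a rough illustration of the z-score just described, the following sketch computes it from a list of optimized modularities of random-graph samples. The function name and the numerical values are hypothetical, chosen only for illustration, and the sample standard deviation is assumed as the estimator of σ.

```python
from statistics import mean, stdev


def modularity_z_score(q_real, q_null_samples):
    """Number of standard deviations by which q_real exceeds the null mean."""
    mu = mean(q_null_samples)
    sigma = stdev(q_null_samples)  # sample standard deviation (assumed estimator)
    if sigma == 0:
        raise ValueError("null-model modularities have zero spread")
    return (q_real - mu) / sigma


# Hypothetical values: optimized modularities of five random-graph samples
# and of the real network.
null_q = [0.30, 0.32, 0.28, 0.31, 0.29]
z = modularity_z_score(0.42, null_q)
```

In practice each entry of the null list would itself come from a full modularity optimization on one random graph, which is what makes this test costly.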

This approach, however, has a number of problems. 
First, it can generate both false positives and false nega- 
tives. Some networks that do not have strong community 
structure in the traditional sense nonetheless have mod- 
ularity significantly above that of the random graph, as 
shown for example in [23]. Conversely, there are also 
some networks that are widely agreed to show strong 
community structure but whose modularity is not signif- 
icantly greater than the random graph. We give some 
examples of this type of behavior later in this paper. (To 
be fair, such examples appear to be rare, so that a large 
difference in modularities may in some situations be con- 
sidered supporting, though not conclusive, evidence of 
community structure.) 

More importantly, however, the difference in modu- 
larities does not really address the question we want to 
answer. In this paper we argue that the defining property 
of significant community structure is not a high modu- 
larity, but a community structure that is robust against 
small perturbations of the network. If a small change 
in the network — an edge added here, another deleted 
there — can completely change the outcome of our com- 
munity finding calculations then, we argue, the commu- 
nities found should not be considered trustworthy. The 
z-score is not, in general, a good measure of this type of 
robustness or fragility in a network, but there exist other 
measures that, as we will show, appear to work well. 



II. ROBUSTNESS OF COMMUNITY 
STRUCTURE 

An interesting approach to testing the significance of
community assignments has been proposed by Massen and Doye
[28], who investigated the distribution of modularity values
for a variety of networks, both real and computer generated,
using a simulated annealing technique similar to that of
Reichardt and Bornholdt combined with a parallel tempering
scheme of the type commonly used to equilibrate simulations
of glassy systems [30]. As
a function of the annealing temperature they investigated 
(among other things) the average modularity of divisions 
found, with higher temperatures favoring poor divisions 
(low modularity) and lower temperatures favoring better 
ones (high modularity). 

In low-temperature systems, where only states of high 
modularity are sampled, they found two distinct behav- 
iors. In most real networks they found that the states 
sampled correspond to roughly the same division of the 
network into communities, while in random graphs the 
states sampled correspond to a variety of quite different 
divisions. This suggests that real-world networks typi- 
cally have a clear global modularity maximum with no 
other competitive maxima, while random graphs have 
many competing maxima. In the language of physics, 
the distribution of maxima has a band gap between the 
ground and excited states in the real networks, but no 
band gap in the random graph. (One can also think of 
the system's behavior by analogy with glassy systems, 
which have many competing energy minima, and non-glassy
ones, which typically do not. Indeed, ideas from the theory
of spin glasses, in particular replica symmetry, have proved
useful in the study of modularity, suggesting that the
difference between the community structure of random and
real-world networks may be connected with the phenomenon of
replica symmetry breaking.)

One can make use of this observation to identify com- 
munity structure of the kind found in random graphs that 
occurs purely as a result of chance fluctuations: if we ob- 
serve multiple modularity maxima in a network, corre- 
sponding to distinct community assignments and having 
roughly equal height, we can conclude that the assign- 
ments in question are not trustworthy. This approach 
will reliably rule out random graphs themselves — a basic 
task that any significance test must certainly be capa- 
ble of — but it can in principle also rule out other cases 
and does so in a natural way, since any network that has 
many different community assignments of roughly equal 
merit can reasonably be said not to show clear commu- 
nity structure. 

This approach provides only a way to rule out can- 
didate assignments. It allows us firmly to reject some 
possibilities because of the structure of the modularity 
maxima, but we can never guarantee that an observed 
community assignment is significant solely on the basis of 
this test. Having multiple competing modularity maxima 
is a good indicator that the community structure given 
by the highest of those maxima is not trustworthy, but it 
is also possible that chance fluctuations could produce a 
network in which the highest maximum is substantially 
higher than any other even if the network has no under- 
lying community structure. In this respect, the method 
is similar to other significance tests in statistics. Signif- 
icance tests only ever reject hypotheses (or fail to reject 
them) but can never absolutely confirm a hypothesis to 
be correct. 

Massen and Doye proposed to implement tests of this 
kind by using their simulated annealing method to find 
all or a representative subset of the assignments having 
greatest modularity in a network and then see if they 
have similar community structure. Simulated anneal- 
ing, however, is computationally costly and is usually not 
the optimization method of choice. And the approach of 
Massen and Doye cannot easily be generalized to other 
optimization methods, such as the spectral method. We 
propose, therefore, a different approach based on network 
perturbations. 

Small changes to a network — the addition or removal 
of a few edges, for example — will in general result in 
small changes to the value of the modularity for partic- 
ular partitions of the network. In a network with many 
closely competitive modularity maxima, this can change 
the relative heights of the maxima with the result that 
the global optimum may shift from one maximum to an- 
other. In a network with only a single optimum on the 
other hand this cannot happen, prevented in effect by 
the presence of the band gap. Thus, a simple way to de- 
termine whether the network we are looking at has just 
a single optimum is to perturb the network slightly and 
observe the resulting change in the optimal partition. 

This idea is the basis for our proposed method. In ef- 
fect we turn the question of the significance of a division 
of a network into a question about the robustness of that 
division against perturbations, and the latter question 
can in practice be answered more easily. Our method 
also has the substantial advantage of being entirely ag- 
nostic about the way we discover our communities. We 
are not even required to use a modularity optimization 
technique — any technique that reliably finds community 
structure where present will do. We describe our method 
in detail in the following sections. 



III. QUANTIFICATION OF NETWORK 
ROBUSTNESS 

Our approach has two key components: perturbation 
of the network and quantification of the resulting change 
in the community structure. We describe these two com- 
ponents in turn. 



A. Network perturbation 

We wish to specify a method for perturbing an arbi- 
trary network by an arbitrary amount. In order to make 
comparison of communities straightforward, we restrict 
our perturbed networks to having the same numbers of 
vertices and edges as the original unperturbed network — 
only the positions of the edges will be perturbed. Fur- 
thermore, we desire that a network perturbed only a 
small amount has just a few edges moved, while a max- 
imally perturbed network becomes completely random 
and uncorrelated with the original. 

There are a number of ways in which this could be achieved
but one of the simplest is the following. We define a random
graph with n vertices and m edges in standard fashion by
distributing the edges between vertex pairs such that the
probability of any particular edge falling between vertices
i and j is e_ij/m. This implies that the expected number of
edges between i and j will be equal to e_ij. (Technically,
the diagonal elements of e_ij are different: they are equal
to twice the expected number of edges. The extra factor of
two allows for the fact that there are two ways of choosing
a vertex pair if i and j are distinct but only one way if i
and j are equal.)

This definition still leaves us a good amount of freedom
since we haven't chosen the form of e_ij. Except for the
constraint that the total number of edges equals m, so that
(1/2) Σ_ij e_ij = m, we are at liberty to make any choice we
wish, but the obvious candidate is the so-called
configuration model, which is also the null model normally
used in the definition of the modularity and the random
graph model against which values of the modularity are
usually compared [26]. The expected number of edges between
vertices in the configuration model is

    e_{ij} = \frac{k_i k_j}{2m},    (2)

where k_i is the degree of vertex i in the original network. 

Now we interpolate stochastically between our original
network and this random graph by "rewiring" (i.e., moving)
edges. Specifically, we go through each edge in the original
network in turn and with probability α we remove it and
replace it with a new edge between a pair of vertices (i, j)
chosen randomly with probability e_ij/m. Otherwise, with
probability 1 − α, we leave the edge as it is.

If α = 0, no edges are moved and this process preserves our
original network. If α = 1 all edges are moved and the
process generates a random graph drawn from the model
ensemble. And for values of α in between it generates
networks in which some of the edges retain their original
positions while others are moved to positions drawn from
the random ensemble. 
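
The rewiring step can be sketched as follows, assuming the network is stored as a list of vertex pairs. The function name and the "stub list" device are illustrative choices, not taken from the text: drawing both endpoints independently with probability proportional to degree selects a distinct pair (i, j) with probability k_i k_j / 2m^2, which matches e_ij/m for the configuration-model choice of e_ij. Self-edges and repeated edges can occur, as they do in the random graph ensemble itself.

```python
import random


def perturb_network(edges, alpha, rng=None):
    """Move each edge with probability alpha to a degree-weighted random position.

    edges is a list of (u, v) pairs; the result has the same number of edges.
    """
    rng = rng or random.Random()
    # Every vertex appears in this list once per edge end, so a uniform draw
    # picks vertex i with probability k_i / 2m (degrees of the original network).
    stubs = [v for edge in edges for v in edge]
    perturbed = []
    for edge in edges:
        if rng.random() < alpha:
            perturbed.append((rng.choice(stubs), rng.choice(stubs)))
        else:
            perturbed.append(edge)
    return perturbed


# Hypothetical four-vertex ring, rewired with alpha = 0.5.
example_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
example_perturbed = perturb_network(example_edges, 0.5, random.Random(42))
```

With alpha = 0 the function returns the original edge list unchanged, and with alpha = 1 every edge is redrawn.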

With the choice (2) for e_ij, the expected number of edges
between vertices i and j in our perturbed network is

    \langle \tilde{A}_{ij} \rangle = (1 - \alpha) A_{ij} + \alpha \frac{k_i k_j}{2m},    (3)

where A_ij is an element of the adjacency matrix

    A_{ij} = \begin{cases} 1 & \text{if an edge connects nodes $i$ and $j$,} \\ 0 & \text{otherwise.} \end{cases}    (4)

Then the expected degree of vertex i is

    \langle \tilde{k}_i \rangle = \sum_j \langle \tilde{A}_{ij} \rangle
        = (1 - \alpha) \sum_j A_{ij} + \alpha \sum_j \frac{k_i k_j}{2m}
        = (1 - \alpha) k_i + \alpha \frac{k_i}{2m} 2m = k_i,    (5)

where we have made use of Σ_j A_ij = k_i and Σ_j k_j = 2m.

Thus our perturbation scheme generates networks that not
only have the same number of edges as the original, but in
which the expected degrees of vertices are the same as the
original degrees.



B. Quantifying differences in community structure 

The second component of our calculation is the comparison of
the optimal division of the perturbed network to the optimal
division of the original network, to see if the community
structure has changed significantly. A number of methods for
measuring similarities or differences between partitions of
a network have been proposed in the past. They can be
divided roughly into three groups: methods based on pair
counting, methods based on cluster matching, and information
theoretic methods. We begin by reviewing some of these
before we discuss our choice, the variation of information.
Our discussion follows that of Meila [31].

Let C and C' be two divisions of the same network into
communities. We will refer to such divisions as community
assignments.

Measures of the similarity or difference between two
community assignments based on pair counting focus on the
number of pairs of vertices that are in the same or
different communities in both assignments. Such measures
include the Jaccard coefficient and the Rand index. We
define the following four numbers:

a_00 = pairs in different communities in both C and C',
a_11 = pairs in the same communities in both C and C',
a_01 = pairs in different (same) communities in C (C'),
a_10 = pairs in same (different) communities in C (C').

Then, for example, the unadjusted Rand index is defined to
be the ratio of the number of pairs clustered in the same
way in both assignments to the total number of pairs thus:

    R(C, C') = \frac{a_{11} + a_{00}}{a_{10} + a_{01} + a_{00} + a_{11}}.    (6)

The Rand index is also sometimes used in an adjusted form in
which a null-model expectation value is subtracted from the
unadjusted index to give a value that is axiomatically zero
in the null model. Such adjusted indices have the
disadvantage, however, of non-locality: the distance between
two community assignments that differ only in one region of
the network depends on how the rest of the network is
partitioned.

An alternative approach is cluster matching, as embodied in
measures such as the van Dongen metric and the
classification error. These measures attempt to determine
the best match for each cluster in C to one of the clusters
in C'. Suppose our two community assignments C and C' are
composed of K and K' communities respectively. The
individual communities we will denote C_1 ... C_K and
C'_1 ... C'_K'. Then let n_k and n'_k' be the sizes of
communities C_k and C'_k', and n_kk' be the number of
vertices common to communities C_k and C'_k' (i.e.,
n_kk' = |C_k ∩ C'_k'|). Then the normalized van Dongen
metric is defined by

    D(C, C') = 1 - \frac{1}{2n} \left[ \sum_{k=1}^{K} \max_{k'} n_{kk'} + \sum_{k'=1}^{K'} \max_{k} n_{kk'} \right].    (7)




Note that such measures ignore any subdivisions of a 
community that is never chosen as a match to a com- 
munity in the other assignment. For example, suppose 



    C   = {{a,b,c},{d,e,f,g}},        (8a)
    C'  = {{a,b,c},{d,e},{f,g}},      (8b)
    C'' = {{a,b,c},{d},{e},{f,g}}.    (8c)

Under the van Dongen scheme D(C, C') = D(C, C''), 
although many would claim (and most other measures 
agree) that C is more similar to C' than to C''. 
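
The following sketch evaluates the unadjusted Rand index and the normalized van Dongen metric on the three example assignments above. It assumes a community assignment is stored as a dictionary from vertex to community label; the function names and the `labels` helper are illustrative only.

```python
from collections import Counter
from itertools import combinations


def labels(groups):
    """Map each vertex to the index of the group that contains it."""
    return {v: k for k, group in enumerate(groups) for v in group}


def rand_index(c1, c2):
    """Unadjusted Rand index: fraction of vertex pairs treated alike."""
    a11 = a00 = pairs = 0
    for u, v in combinations(sorted(c1), 2):
        pairs += 1
        same1 = c1[u] == c1[v]
        same2 = c2[u] == c2[v]
        if same1 and same2:
            a11 += 1
        elif not same1 and not same2:
            a00 += 1
    return (a11 + a00) / pairs


def van_dongen(c1, c2):
    """Normalized van Dongen metric from the best-match overlaps."""
    n = len(c1)
    overlap = Counter((c1[v], c2[v]) for v in c1)  # n_kk'
    best_for_k, best_for_kp = {}, {}
    for (k, kp), count in overlap.items():
        best_for_k[k] = max(best_for_k.get(k, 0), count)
        best_for_kp[kp] = max(best_for_kp.get(kp, 0), count)
    return 1 - (sum(best_for_k.values()) + sum(best_for_kp.values())) / (2 * n)


C = labels(["abc", "defg"])            # Eq. (8a)
C1 = labels(["abc", "de", "fg"])       # Eq. (8b), C'
C2 = labels(["abc", "d", "e", "fg"])   # Eq. (8c), C''

d1, d2 = van_dongen(C, C1), van_dongen(C, C2)
r1, r2 = rand_index(C, C1), rand_index(C, C2)
```

Here the two van Dongen distances coincide, while the Rand index rates C' as closer to C than C'' is, in line with the remark that most other measures distinguish the two cases.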

A third class of measures for comparing community
assignments is based on information theoretic ideas [35].
In measures such as these, we regard our community
assignments as "messages" and consider the Shannon
information content of these messages. The most common way
to do this is to define x_i to be the label of the
community that vertex i belongs to in C and y_i to be the
community it belongs to in C'. Then the messages consist
simply of the ordered sets {x_i} and {y_i}. If one knows
the joint distribution from which the x's and y's are drawn
one can then calculate various standard information
measures. The usual assumption is that the joint
distribution is equal simply to that of the observed
community assignment. In other words, x and y are assumed
to be values of random variables X and Y with joint
distribution P(X = x, Y = y) = n_xy/n, where n is the total
number of vertices in the network. This immediately implies
also that P(X = x) = n_x/n and P(Y = y) = n'_y/n.

In a slight abuse of terminology, we can then define the
mutual information between the assignments C and C' to be
equal to the mutual information between the corresponding
random variables:

    I(C; C') = I(X; Y) = \sum_{x=1}^{K} \sum_{y=1}^{K'} P(x,y) \log \frac{P(x,y)}{P(x) P(y)},    (9)

where we use the shorthand notation P(x) to denote P(X = x)
and similarly for the other distributions. (Within physics,
researchers have traditionally used the natural logarithm
in expressions such as (9), while in computer science the
logarithm base 2 is more common. The choice makes only the
difference of a multiplicative constant, however, and has
no effect on any of our results.)

The mutual information measures how much information we
learn about C' if we know C. If C and C' are identical, then
we learn everything about C' from C. If they are entirely
uncorrelated then we learn nothing. One way to express this
is to make use of P(x,y) = P(x|y)P(y) to write

    I(X; Y) = \sum_{xy} P(x|y) P(y) \log \frac{P(x|y)}{P(x)} = H(X) - H(X|Y),    (10)

where H(X) is the information (or entropy) of X and H(X|Y)
is the conditional entropy, i.e., the additional
information needed to describe X once we know Y. Thus if Y
tells us nothing about X the two terms are equal and
I(X; Y) is zero. In essence the mutual information tells us
the same thing as the conditional entropy, but the mutual
information is symmetric in X and Y where the conditional
entropy is not, which makes the former a more attractive
measure of distance than the latter.

The mutual information alone, however, is not a good
measure of the difference between our community
assignments. Consider, for example, the three example
assignments of Eq. (8). In this case the conditional
entropies H(C|C') and H(C|C'') are both zero, because given
the community assignment C' or C'' (and the appropriate
mapping of community labels from one assignment to the
other) we can deduce the assignment C. (The mapping of
labels must be given, since the labels are arbitrary and we
do not want our measure to register a difference between
two assignments that in fact differ only in a permutation
of the labels.) Therefore I(C; C') = I(C; C'') = H(C) in
this case, which is clearly not a useful answer. This
problem is usually dealt with by normalizing the mutual
information. There are a number of ways of accomplishing
this but, for example, one can define

    I_{\mathrm{norm}}(C, C') = \frac{2 I(C; C')}{H(C) + H(C')}.    (11)

A variant of this measure has been used by Danon et al. to
define standardized tests for the performance of community
finding algorithms. Although the measure works, it is quite
difficult to interpret, particularly in the normalized
form, which makes it hard to give a simple statement about
what the values mean (other than to say they get larger as
community assignments become more similar).
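
As a check on the argument above, the sketch below computes the entropy, the mutual information, and the normalized mutual information for the example assignments of Eq. (8), using natural logarithms and empirical frequencies n_xy/n. Assignments are again assumed to be dictionaries from vertex to label, and all function names are illustrative.

```python
from collections import Counter
from math import log


def labels(groups):
    """Map each vertex to the index of the group that contains it."""
    return {v: k for k, group in enumerate(groups) for v in group}


def entropy(c):
    """Shannon entropy H of an assignment, in nats."""
    n = len(c)
    return -sum((m / n) * log(m / n) for m in Counter(c.values()).values())


def mutual_information(c1, c2):
    """Mutual information I(C; C') from the empirical joint distribution."""
    n = len(c1)
    n_x = Counter(c1.values())
    n_y = Counter(c2.values())
    n_xy = Counter((c1[v], c2[v]) for v in c1)
    return sum(
        (m / n) * log((m / n) / ((n_x[x] / n) * (n_y[y] / n)))
        for (x, y), m in n_xy.items()
    )


def normalized_mi(c1, c2):
    """Normalized form 2 I / (H + H'); taken to be 1 if both entropies vanish."""
    total = entropy(c1) + entropy(c2)
    return 1.0 if total == 0 else 2 * mutual_information(c1, c2) / total


C = labels(["abc", "defg"])
C1 = labels(["abc", "de", "fg"])       # C'
C2 = labels(["abc", "d", "e", "fg"])   # C''
```

Both `mutual_information(C, C1)` and `mutual_information(C, C2)` equal `entropy(C)` up to rounding, which is the degeneracy the normalization is meant to remove.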



C. Variation of information 

In our work we make use of a different information
theoretic measure, the variation of information
[31, 35, 36]. The variation of information is defined by

    V(C, C') = V(X, Y)
             = H(X) + H(Y) - 2 I(X; Y)
             = H(X|Y) + H(Y|X)
             = -\sum_{xy} P(x,y) \log \frac{P(x,y)}{P(y)} - \sum_{xy} P(x,y) \log \frac{P(x,y)}{P(x)}.    (12)


The variation of information is the sum of the information
needed to describe C given C' and the information needed to
describe C' given C. It has a number of desirable
properties that other measures lack. It is a true metric on
the space of community assignments, having all the
properties of a proper distance measure. It is also a local
measure in the sense described above and it returns the
intuitively correct answer for the example of Eq. (8), that
V(C, C'') > V(C, C').

The maximum value of the variation of information is 
log n, which is achieved when the community assignments 
are as far apart as possible, which in this case means that 
one of them places all the nodes together in a single com- 
munity while the other places each node in a community 
on its own. The maximum value increases with n be- 
cause larger data sets contain more information, but if 
this property is undesirable one can simply normalize by 
log n, as we do in the calculations presented here. In fact, 
since we will always be comparing networks of the same 
size, the normalization is irrelevant anyway. 
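
A minimal sketch of the variation of information, computed as the sum of the two conditional entropies from empirical frequencies, is given below. The dictionary representation and the function names are assumptions made for illustration; natural logarithms are used.

```python
from collections import Counter
from math import log


def labels(groups):
    """Map each vertex to the index of the group that contains it."""
    return {v: k for k, group in enumerate(groups) for v in group}


def variation_of_information(c1, c2):
    """V(C, C') = H(X|Y) + H(Y|X), in nats."""
    n = len(c1)
    n_x = Counter(c1.values())
    n_y = Counter(c2.values())
    n_xy = Counter((c1[v], c2[v]) for v in c1)
    vi = 0.0
    for (x, y), m in n_xy.items():
        p = m / n
        # p*log(p/P(y)) contributes to -H(X|Y); p*log(p/P(x)) to -H(Y|X).
        vi -= p * (log(p / (n_y[y] / n)) + log(p / (n_x[x] / n)))
    return vi


C = labels(["abc", "defg"])
C1 = labels(["abc", "de", "fg"])       # C'
C2 = labels(["abc", "d", "e", "fg"])   # C''

v1 = variation_of_information(C, C1)
v2 = variation_of_information(C, C2)

# Extreme case: all vertices together versus every vertex alone.
together = labels(["abcdefg"])
alone = labels(list("abcdefg"))
v_max = variation_of_information(together, alone)
```

For the example assignments the first distance is smaller than the second, and the extreme case reproduces the maximum value log n with n = 7. Dividing by `log(n)` gives the normalized form.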



IV. METHODS 

We now have all the components we need to describe our
method as applied to a given network. First, we find the
community assignment C that maximizes the modularity of the
network, or the best approximation to it given the
optimization algorithms available. Second, we perturb the
network as described in Section III A to create a new
network, find the optimal community assignment C' for that
perturbed network, and measure the variation of information
between C' and C. We repeat this second step many times to
derive an average value for the variation of information,
and repeat the entire calculation for a range of different
values of the perturbation parameter α.

For comparison, we also perform the same set of cal- 
culations on a random graph drawn from a configuration 
model with the same degree sequence as the original net- 
work. Then we repeat the process for several more such 
random graphs and average the values of the variation of 
information. 
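
The overall procedure can be sketched as a short loop. This is an outline under stated assumptions, not the implementation used for the results in this paper: the community finder is passed in as a function, and the stand-in used in the example (connected components) is a deliberately trivial placeholder for a real method such as spectral modularity optimization. All names are hypothetical.

```python
import random
from collections import Counter
from math import log


def perturb(edges, alpha, rng):
    """Redraw each edge with probability alpha, endpoints weighted by degree."""
    stubs = [v for e in edges for v in e]
    return [(rng.choice(stubs), rng.choice(stubs)) if rng.random() < alpha else e
            for e in edges]


def variation_of_information(c1, c2):
    """V(C, C') in nats for two vertex-to-label dictionaries."""
    n = len(c1)
    n_x, n_y = Counter(c1.values()), Counter(c2.values())
    n_xy = Counter((c1[v], c2[v]) for v in c1)
    return -sum((m / n) * (log(m / n_x[x]) + log(m / n_y[y]))
                for (x, y), m in n_xy.items())


def robustness_curve(nodes, edges, find_communities, alphas, trials=10, seed=0):
    """Average normalized variation of information for each value of alpha."""
    rng = random.Random(seed)
    base = find_communities(nodes, edges)
    norm = log(len(nodes))
    curve = []
    for alpha in alphas:
        total = 0.0
        for _ in range(trials):
            perturbed = find_communities(nodes, perturb(edges, alpha, rng))
            total += variation_of_information(base, perturbed)
        curve.append((alpha, total / (trials * norm)))
    return curve


def connected_components(nodes, edges):
    """Placeholder community finder: labels vertices by connected component."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return {v: find(v) for v in nodes}


# Hypothetical toy network: two disconnected triangles.
nodes = list(range(6))
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
curve = robustness_curve(nodes, edges, connected_components, [0.0, 0.5, 1.0])
```

The same function applied to random graphs with the same degree sequence would give the comparison curve described next.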

The computer time required to complete the calculations
depends on the method used to optimize the modularity, the
number of random graph samples taken, and the number of
different values of α. In our calculations, as mentioned
above, we use the spectral optimization method of [21],
which is reasonably fast, though certainly not the fastest
available, and average over 10 or 100 random graphs
depending on network size for each of 40 different values
of α from 0 to 1. The complete calculation for the largest
network studied here, with nearly 5000 vertices, took about
a day on a standard desktop computer.



V. RESULTS 

As a first demonstration of the method, we have applied it
to a set of computer generated networks of a type proposed
in earlier work and used widely in the evaluation of
community detection algorithms. These networks consist of
128 vertices divided into 4 communities of 32 nodes each.
Each vertex pair is connected by an edge with one of two
different probabilities, one for pairs in the same group
and one for pairs in different groups, with values chosen
so that the expected degree of each vertex remains fixed at
16. As the average number b of between-group connections
per vertex is increased from zero, the community structure
in the network, stark at first, becomes gradually obscured
until, at the point where between- and within-group edges
are equally likely, the network becomes a standard Poisson
random graph with no community structure at all.
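
A test network of this kind can be generated along the following lines. The two connection probabilities are an assumption derived from the stated constraints: each vertex has 31 possible partners in its own group and 96 outside it, so within-group probability (16 − b)/31 and between-group probability b/96 give an expected degree of 16 with b between-group connections on average. The function name is illustrative.

```python
import random
from itertools import combinations


def four_group_benchmark(b, n_groups=4, group_size=32, degree=16, rng=None):
    """Return (group, edges) for one realization of the test network."""
    rng = rng or random.Random()
    n = n_groups * group_size
    p_in = (degree - b) / (group_size - 1)   # same-group pairs
    p_out = b / (n - group_size)             # different-group pairs
    group = {v: v // group_size for v in range(n)}
    edges = [
        (u, v)
        for u, v in combinations(range(n), 2)
        if rng.random() < (p_in if group[u] == group[v] else p_out)
    ]
    return group, edges


# One realization with, on average, b = 4 between-group connections per vertex.
planted, benchmark_edges = four_group_benchmark(4, rng=random.Random(7))
```

The returned `group` dictionary records the planted communities, which is convenient for checking what a detection method recovers.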

Figure 1 shows the results of the application of our
analysis method to graphs of this type. The figure shows
the value of the normalized variation of information as a
function of the parameter α that measures the amount of
perturbation. As we can see, the variation of information
starts at zero when α = 0, as we would expect for an
unperturbed network, rises rapidly, then levels off as α
approaches its maximum value of 1. Also shown is the curve
for a random graph null model of the type described above.

For large values of b, such as b = 10, the curve of the
variation of information is essentially identical to that
of the null model, indicating that whatever community
structure has been found by the algorithm is no more robust
against perturbation than that of a random graph. But as b
gets smaller the variation of information increases more
slowly as a function of α and the curves depart
significantly from the null model, indicating that the
community structure discovered by the algorithm is
relatively robust against perturbation.

FIG. 1: The variation of information as a function of the
perturbation parameter α for the 128-node four-community
test networks described in the text (100 networks per point).

As an aid to the interpretation of the results, we have
also included in the figure (and in all subsequent similar
figures) horizontal lines corresponding to the value the
variation of information would take if we were to randomly
assign 10% and 20% of the vertices to different
communities. The fact that the curves of variation of
information cross these lines at larger values of α in some
cases than others indicates that the community structure is
more or less robust to perturbation. Indeed, one could
simply quote the values of α at which the crossings occur
as a single scalar measure of robustness, but to do so can
mean missing interesting structure present in the full
curves, so we have avoided this approach in our
calculations.
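
One way such reference levels could be estimated is sketched below. The details are assumptions, since the text does not specify them: a fixed fraction of vertices is chosen uniformly at random, each is moved to a different community chosen uniformly among the others, and the normalized variation of information is averaged over many trials. The function names are hypothetical.

```python
import random
from collections import Counter
from math import log


def variation_of_information(c1, c2):
    """V(C, C') in nats for two vertex-to-label dictionaries."""
    n = len(c1)
    n_x, n_y = Counter(c1.values()), Counter(c2.values())
    n_xy = Counter((c1[v], c2[v]) for v in c1)
    return -sum((m / n) * (log(m / n_x[x]) + log(m / n_y[y]))
                for (x, y), m in n_xy.items())


def reassign_fraction(assignment, fraction, rng):
    """Move a random fraction of vertices to a different existing community."""
    all_labels = sorted(set(assignment.values()))
    moved = rng.sample(sorted(assignment), round(fraction * len(assignment)))
    new = dict(assignment)
    for v in moved:
        new[v] = rng.choice([l for l in all_labels if l != assignment[v]])
    return new


def reference_level(assignment, fraction, trials=100, seed=0):
    """Average normalized variation of information after random reassignment."""
    rng = random.Random(seed)
    norm = log(len(assignment))
    total = sum(
        variation_of_information(assignment,
                                 reassign_fraction(assignment, fraction, rng))
        for _ in range(trials)
    )
    return total / (trials * norm)


# Four planted groups of 32 vertices, as in the test networks.
planted = {v: v // 32 for v in range(128)}
level_10 = reference_level(planted, 0.10)
level_20 = reference_level(planted, 0.20)
```

The assignment must contain at least two communities for the reassignment step to be defined.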

Turning now to real-world networks, we have tested our
method on a variety of examples including social,
technological, and biological networks. A selection of
results are shown in Fig. 2. Some summary statistics for
the same networks are given in Table I.

Figure 2a shows the curve of variation of information as a
function of α for one of the best studied examples of
community structure in a social network, the "karate club"
network of Zachary. (The karate club has become so common
an example in this context that it has almost come to the
point where no publication about community structure could
be complete if it failed to discuss this network.) The
vertices in this network represent members of a karate club
at a US university in the 1970s and the edges represent
friendship between members based on independent
observations by the experimenter. The network is widely
believed to show strong community structure and repeated
studies have upheld this view.

The black points (squares) in the figure show the vari- 
ation of information for the real network while the red 
points (triangles) show the results for the equivalent ran- 
dom graph. It is clear in this case that the community 
structure discovered in the real network is substantially 
FIG. 2: The variation of information as a function of the perturbation parameter α for six real-world networks as described in 
the text, along with equivalent results for the corresponding random graphs. 



more robust against perturbation than that of the ran- 
dom graph. For example, the curve for the real net- 
work crosses the line representing reassignment of 20% 
of the vertices close to the point where α = 0.2. Speak- 
ing loosely, we can say that about 20% of the edges must 
be rewired before 20% of the vertices move to different 
communities. For the random graph, on the other hand, 
only about 5% of the edges need be rewired to reach this 
point. 
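
The crossing points quoted above can be read off a measured curve by interpolation. The sketch below is one simple way to do this, assuming the curve is sampled at increasing values of α and that linear interpolation between neighboring samples is adequate. The function name `crossing_alpha` is hypothetical.

```python
def crossing_alpha(alphas, vi_values, threshold):
    """Smallest alpha at which a sampled curve first reaches `threshold`.

    `alphas` must be increasing and `vi_values` holds the variation of
    information measured at each alpha. The curve is treated as piecewise
    linear between samples. Returns None if the threshold is never reached.
    """
    points = list(zip(alphas, vi_values))
    if points and points[0][1] >= threshold:
        return points[0][0]
    for (a0, v0), (a1, v1) in zip(points, points[1:]):
        if v0 < threshold <= v1:
            # Linear interpolation within the bracketing interval.
            return a0 + (threshold - v0) * (a1 - a0) / (v1 - v0)
    return None
```

Applied to the curves for a real network and for its random-graph counterpart with the same threshold, this gives the pair of α values compared in the text. As noted earlier, a single crossing value discards information present in the full curve, so it is best treated as a summary rather than a replacement.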

A contrasting situation is seen in Fig. 2b, which shows 
results for another social network, a network of friend- 
ships among a group of first-year university students 
at the University of Groningen in the Netherlands [38]. 
Data for this network were collected by circulating ques- 
tionnaires among members of the group; edges between 
pairs of students indicate that at least one member of 
the pair stated either that they were friends or that they 
had a "friendly relationship." Despite the similar nature 
of this network and the karate club network (both are 
networks of friendship among university students), the 
results of the analysis are quite different. In the Gronin- 
gen network, as Fig. 2b shows, there is essentially no 
difference between the variation of information for the 
real network and the corresponding random graph. The 
community structure algorithm does detect some struc- 
ture in the network, finding four communities of sizes 5, 
7, 9, and 11 vertices respectively and a respectable modu- 
larity score of 0.368, but our robustness analysis indicates 
that this structure is not significant and therefore should 
probably not be taken as indicative of the presence of 
any real communities in the network. 

Our next two examples are both biological networks. 
The first (Fig. 2c) represents the structure of a pro- 
tein (an immunoglobulin), with the vertices representing 
α-helices and β-sheets and an edge between any two that 
are less than 10 Å apart [39]. The second (Fig. 2d) rep- 
resents known portions of the metabolic network of the 
nematode C. elegans, with vertices representing metabo- 
lites and edges representing metabolic reactions [40]. 
Again the two networks show contrasting behaviors. The 
community structure in the protein network displays sub- 
stantial robustness against perturbations, with a wide 
gap between the variation of information curves for the 
true network and the random graph. A value of the vari- 
ation of information equivalent to the randomization of 
20% of the vertices is not reached until a perturbation 
strength of around α = 0.3. The metabolic network by 
contrast reaches the same point around α = 0.05, not 
much better than the equivalent random graph. The 
curve of variation of information for the metabolic net- 
work does, however, remain distinct from that of the corre- 
sponding random graph for higher values of α, indicating 
that some portion of the community structure found is 
relatively robust. 

Our last two examples are technological networks, an 
electronic circuit [41, 42] and a network representation of 
the power grid of the western United States [43]. Both 
of these networks show weak community structure simi- 
lar to that of the metabolic network, with a variation of 
information that increases rapidly with α at first, indi- 
cating that much of the observed structure is quite fragile 
to perturbation, though the curves again remain distinct; 
we conclude that the networks show some community 
structure, even if the effects are not strong. 

Now compare these results with those given in Table I. 
The final column of the table gives a z-score for each 



network                  modularity    z-score
Test k = 6                  0.373        21.0
Test k = 7                  0.311        11.1
Test k = 8                  0.248         2.63
Test k = 9                  0.217        -2.04
Test k = 10                 0.210        -2.99
Karate club                 0.419         1.77
University students         0.368        -0.19
Protein structure           0.763        24.5
C. elegans metabolic        0.434        25.4
Electronic circuit          0.805        31.2
Power grid                  0.925       100.8

TABLE I: Maximum modularity and z-scores for each of the 
networks studied here. The first five lines of results are aver- 
ages over computer-generated random networks as described 
in the text. The final six are real-world examples. 



network calculated as described in the introduction (see 
Eq. (1)). The comparison with the curves for variation 
of information is an interesting one. Five of the six net- 
works have positive z-scores, but not all of the scores are 
large enough to make the results statistically significant. 
The most common rule of thumb is that measurements 
are significant if they lie more than two standard devia- 
tions from the mean of the null model, i.e., if z > 2. By 
this rule, neither of our social networks has significant 
community structure, a surprising conclusion given that 
it is universally accepted that the karate club network 
has strong community structure, confirmed by repeated 
studies using many methods, and our variation of infor- 
mation calculation confirms this also. For the network of 
university students, on the other hand, the z-score and 
our calculations concur, both indicating that the commu- 
nity structure found is not significant. This too is a troubling 
result, since it implies that a low z-score may correspond 
either to strong community structure or to none at all. 
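
For concreteness, the z-score rule of thumb discussed above can be sketched as follows. The sketch assumes that the mean and standard deviation of the null model are estimated from a finite sample of maximum modularity values found on randomized versions of the network, and it uses the sample standard deviation. The function names and the default threshold argument are hypothetical.

```python
import statistics


def modularity_z_score(q_real, q_null_samples):
    """Number of standard deviations by which q_real exceeds the null mean.

    `q_null_samples` holds maximum modularity values found on randomized
    versions of the network (at least two values are required).
    """
    mu = statistics.mean(q_null_samples)
    sigma = statistics.stdev(q_null_samples)  # sample standard deviation
    if sigma == 0:
        raise ValueError("null-model samples have zero variance")
    return (q_real - mu) / sigma


def is_significant(q_real, q_null_samples, threshold=2.0):
    """Apply the rule of thumb z > threshold, with threshold 2 by default."""
    return modularity_z_score(q_real, q_null_samples) > threshold
```

By this rule a network such as the karate club, with z = 1.77 in Table I, would be classed as not significant, which is the conclusion questioned in the text.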

The remaining four networks all have very large z- 
scores; the smallest of them is 24.5 and an observation 
twenty-four standard deviations from the mean will be 
considered significant by essentially any standard. Cu- 
riously, however, there seems to be little correlation be- 
tween the z-scores and the robustness of the community 
structure. The highly robust protein structure network, 
for instance, has the lowest z-score of the four, while the 
power-grid — one of the networks we concluded to have 
only rather weak community structure — has a spectacu- 
lar z = 100.8. Overall, therefore, it appears that while 
z-scores for modularity values probably do give some in- 
dication of the strength of community structure, they are 
in general unreliable and should not be trusted unless 
backed up by other calculations, such as those presented 
here. 
VI. CONCLUSIONS 

In this paper we have examined measures of signifi- 
cance for network community structure that address the 
question of when communities found in a network can 
be considered believable, and could not reasonably have 
been the result of chance fluctuations in network struc- 
ture. We have argued that high modularity scores, the 
conventional measure of significance, have less discrim- 
inatory power than measures that quantify the robust- 
ness of community assignments to network perturbation. 
We have proposed a method for perturbing networks 
and a measure of the robustness under such perturba- 
tions based on the information-theoretic distance metric 
known as the variation of information. In applications 
to both real and computer-generated example networks, 
our method appears able to distinguish successfully and 
clearly between examples that show strong community 
structure and examples that do not. 

In considering future directions for research, we note 



that all of the calculations presented here focus on the 
quality of partitions of an entire network. It is possi- 
ble that there might be significant community structure 
in one part of a network and not in another, and were 
this the case one would like to be able to detect it. The 
methods described here could potentially be useful for 
this type of investigation: one can ask whether some 
communities in a network are robust under perturbation 
while others are not. The global variation of information, 
however, cannot reveal this type of distinction and more 
detailed local measures are needed. We look forward to 
further developments in this area. 



Acknowledgments 

We thank Jörg Reichardt for useful discussions. This 
work was funded in part by the National Science Foun- 
dation under grant DMS-0405348 and by the James S. 
McDonnell Foundation. 



[1] M. E. J. Newman, Detecting community structure in 
networks. Eur. Phys. J. B 38, 321-330 (2004). 

[2] L. Danon, A. Díaz-Guilera, J. Duch, and A. Arenas, 
Comparing community structure identification. J. Stat. 
Mech. P09008 (2005). 

[3] A. Arenas, A. Díaz-Guilera, and C. J. Pérez-Vicente, 
Synchronization reveals topological scales in complex 
networks. Phys. Rev. Lett. 96, 114102 (2006). 

[4] S. Boccaletti, M. Ivanchenko, V. Latora, A. Pluchino, 
and A. Rapisarda, Detection of complex networks modu- 
larity by dynamical clustering. Phys. Rev. E 75, 045102 
(2007). 

[5] M. E. J. Newman and M. Girvan, Finding and evaluat- 
ing community structure in networks. Phys. Rev. E 69, 
026113 (2004). 

[6] A. Arenas, J. Duch, A. Fernandez, and S. Gomez, Size 
reduction of complex networks preserving modularity. 
Preprint arXiv:physics/0702015 (2007). 

[7] M. Newman, Modularity and community structure in 
networks. Proc. Natl. Acad. Sci. USA 103, 8577-8582 
(2006). 

[8] M. Girvan and M. E. J. Newman, Community structure 
in social and biological networks. Proc. Natl. Acad. Sci. 
USA 99, 7821-7826 (2002). 

[9] F. Radicchi, C. Castellano, F. Cecconi, V. Loreto, and 
D. Parisi, Defining and identifying communities in net- 
works. Proc. Natl. Acad. Sci. USA 101, 2658-2663 
(2004). 

[10] I. Derenyi, G. Palla, and T. Vicsek, Clique percolation 
in random networks. Phys. Rev. Lett. 94, 160202 (2005). 

[11] G. Palla, I. J. Farkas, P. Pollner, I. Derenyi, 
and T. Vicsek, Directed network modules. Preprint 
arXiv:physics/0703248 (2007). 

[12] M. Rosvall and C. T. Bergstrom, An information- 
theoretic framework for resolving community structure 
in complex networks. Proc. Natl. Acad. Sci. USA 104, 
7327 (2007). 

[13] M. Hastings, Community detection as an inference prob- 
lem. Phys. Rev. E 74, 035102 (2006). 

[14] A. Clauset, M. E. J. Newman, and C. Moore, Structural 
inference of hierarchies in networks. In Proceedings of the 
23rd International Conference on Machine Learning, As- 
sociation of Computing Machinery, New York (2006). 

[15] M. E. J. Newman, Fast algorithm for detecting com- 
munity structure in networks. Phys. Rev. E 69, 066133 
(2004). 

[16] A. Clauset, M. E. J. Newman, and C. Moore, Finding 
community structure in very large networks. Phys. Rev. 
E 70, 066111 (2004). 

[17] J. Duch and A. Arenas, Community detection in complex 
networks using extremal optimization. Phys. Rev. E 72, 
027104 (2005). 

[18] R. Guimera and L. A. N. Amaral, Functional cartogra- 
phy of complex metabolic networks. Nature 433, 895-900 
(2005). 

[19] A. Medus, G. Acuna, and C. O. Dorso, Detection of com- 
munity structures in networks via global optimization. 
Physica A 358, 593-604 (2005). 

[20] J. Reichardt and S. Bornholdt, Statistical mechanics of 
community detection. Phys. Rev. E 74, 016110 (2006). 

[21] M. E. J. Newman, Modularity and community structure 
in networks. Proc. Natl. Acad. Sci. USA 103, 8577-8582 
(2006). 

[22] T. Luczak, Sparse random graphs with a given degree se- 
quence. In A. M. Frieze and T. Luczak (eds.), Proceedings 
of the Symposium on Random Graphs, Poznan 1989, pp. 
165-182, John Wiley, New York (1992). 

[23] M. Molloy and B. Reed, A critical point for random 
graphs with a given degree sequence. Random Structures 
and Algorithms 6, 161-179 (1995). 

[24] U. Brandes, D. Delling, M. Gaertler, R. Gorke, M. Hoe- 
fer, Z. Nikoloski, and D. Wagner, On finding graph clus- 
terings with maximum modularity. In Proceedings of the 
33rd International Workshop on Graph-Theoretic Con- 
cepts in Computer Science, Lecture Notes in Computer 
Science, Springer, Berlin (in press). 

[25] R. Guimera, M. Sales-Pardo, and L. A. N. Amaral, Mod- 
ularity from fluctuations in random graphs and complex 
networks. Phys. Rev. E 70, 025101 (2004). 

[26] J. Reichardt and S. Bornholdt, When are networks truly 
modular? Preprint cond-mat/0606220 (2006). 

[27] C. P. Massen and J. P. K. Doye, Identifying "communi- 
ties" within energy landscapes. Phys. Rev. E 71, 046101 
(2005). 

[28] C. P. Massen and J. P. Doye, Thermodynamics of 
community structure. Preprint arXiv:cond-mat/0610077 
(2007). 

[29] J. Reichardt and S. Bornholdt, Statistical mechanics of 
community detection. Phys. Rev. E 74, 016110 (2006). 

[30] D. J. Earl and M. W. Deem, Parallel tempering: Theory, 
applications, and new perspectives. Phys. Chem. Chem. 
Phys. 7, 3910-3916 (2005). 

[31] M. Meila, Comparing clusterings-an information based 
distance. Journal of Multivariate Analysis 98, 873-895 
(2007). 

[32] W. M. Rand, Objective criteria for the evaluation of clus- 
tering methods. Journal of American Statistical Associ- 
ation 66, 846-850 (1971). 

[33] M. Meila, Comparing clusterings: an axiomatic view. In 
ICML '05: Proceedings of the 22nd International Con- 
ference on Machine Learning, pp. 577-584, ACM Press, 
New York, NY, USA (2005). 

[34] S. V. Dongen, Performance criteria for graph clustering 
and Markov cluster experiments. National Research In- 
stitute for Mathematics and Computer Science in the 
Netherlands (2000). 



[35] T. M. Cover and J. A. Thomas, Elements of Information 
Theory, Second Edition. Wiley-Interscience, New York, 
NY, USA (2006). 

[36] M. Meila, Comparing clusterings. Technical report. Uni- 
versity of Washington, statistics 418 (2002). 

[37] W. W. Zachary, An information flow model for conflict 
and fission in small groups. Journal of Anthropological 
Research 33, 452-473 (1977). 

[38] M. A. J. van Duijn, E. P. H. Zeggelink, M. Huisman, 
F. N. Stokman, and F. W. Wasseur, Evolution of sociol- 
ogy freshmen into a friendship network. J. Math. Sociol. 
27, 153-191 (2003). 

[39] R. Milo, S. Itzkovitz, N. Kashtan, R. Levitt, S. Shen-Orr, 
I. Ayzenshtat, M. Sheffer, and U. Alon, Superfamilies of 
evolved and designed networks. Science 303, 1538-1542 
(2004). 

[40] H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A. L. 
Barabási, The large-scale organization of metabolic net- 
works. Nature 407, 651-654 (2000). 

[41] F. Brglez, D. Bryan, and K. Kozminski, Iscas'89 bench- 
marks. Proc. IEEE Int. Symp. Circuits Syst. pp. 1929- 
1934 (1989). 

[42] R. Cancho, C. Janssen, and R. Sole, Topology of technol- 
ogy graphs: Small world patterns in electronic circuits. 
Phys. Rev. E 64, 046119 (2001). 

[43] D. J. Watts and S. H. Strogatz, Collective dynamics of 
'small-world' networks. Nature 393, 440-442 (1998). 

[44] Note that the perturbed network may have a small num- 
ber of isolated nodes. We do not discard these nodes, 
since that would make the perturbed network a differ- 
ent size from the original; instead we assign each isolated 
node to its own community. 



